Instrumental Variables Application and Limitations Edwin
Abstract
To correct for confounding, the method of instrumental variables (IV) has been proposed. Its use in medical literature is still rather limited because of unfamiliarity or inapplicability. By introducing the method in a nontechnical way, we show that IV in a linear model is quite easy to understand and easy to apply once an appropriate instrumental variable has been identified. We also point out some limitations of the IV estimator when the instrumental variable is only weakly correlated with the exposure. The IV estimator will be imprecise (large standard error), biased when sample size is small, and biased in large samples when one of the assumptions is only slightly violated. For these reasons, it is advised to use an IV that is strongly correlated with exposure. However, we further show that under the assumptions required for the validity of the method, this correlation between IV and exposure is limited. Its maximum is low when confounding is strong, such as in case of confounding by indication. Finally, we show that in a study in which strong confounding is to be expected and an IV has been used that is moderately or strongly related to exposure, it is likely that the assumptions of IV are violated, resulting in a biased effect estimate. We conclude that instrumental variables can be useful in case of moderate confounding but are less useful when strong confounding exists, because strong instruments cannot be found and assumptions will be easily violated. (Epidemiology 2006;17: 260–267)
In medical research, randomized, controlled trials (RCTs) remain the gold standard in assessing the effect of one variable of interest, often a specified treatment. Nevertheless, observational studies are often used in estimating such an effect. In epidemiologic as well as sociologic and economic research, observational studies are the standard for exploring causal relationships between an exposure and an outcome variable.
The main problem of estimating the effect in such studies is the potential bias resulting from confounding between the variable of interest and alternative explanations for the outcome (confounders). Traditionally, standard methods such as stratification, matching, and multiple regression techniques have been used to deal with confounding. In the epidemiologic literature, some other methods have been proposed, of which the method of propensity scores is best known. In most of these methods, adjustment can be made only for observed confounders. A method that has the potential to adjust for all confounders, whether observed or not, is the method of instrumental variables (IV). This method is well known in economics and econometrics as the estimation of simultaneous regression equations and is also referred to as structural equations and two-stage least squares. This method has a long tradition in economic literature, but has entered more recently into the medical research literature with increased focus on the validity of the instruments. Introductory texts on instrumental variables can be found in Greenland and in Zohoori and Savitz. One of the earliest applications of IV in the medical field is probably the research of Permutt and Hebel, who estimated the effect of smoking by pregnant women on their child’s birth weight, using an encouragement to stop smoking as the instrumental variable. More recent examples can be found in Beck et al, Brooks et al, Earle et al, Hadley et al, Leigh and Schembri, McClellan, and McIntosh. However, it has been argued that the application of this method is limited because of its strong assumptions, making it difficult in practice to find a suitable instrumental variable.
The objectives of this article are first to introduce the application of the method of IV in epidemiology in a nontechnical way and second, to show the limitations of this method, from which it follows that IV is less useful for solving large confounding problems such as confounding by indication.
Submitted 8 February 2005; accepted 16 November 2005. From the *Department of Pharmacoepidemiology and Pharmacotherapy, Utrecht Institute of Pharmaceutical Sciences (UIPS), Utrecht University, Utrecht, The Netherlands; and the †Centre for Biostatistics, Utrecht University, Utrecht, The Netherlands. Supported by the Utrecht Institute of Pharmaceutical Sciences (UIPS), Utrecht University, Utrecht, The Netherlands. Correspondence: Olaf H. Klungel, Department of Pharmacoepidemiology and Pharmacotherapy, Utrecht Institute of Pharmaceutical Sciences (UIPS), Utrecht University, Sorbonnelaan 16, 3584 CA Utrecht, The Netherlands. E-mail: [email protected]. Copyright © 2006 by Lippincott Williams & Wilkins. ISSN: 1044-3983/06/1703-0260. DOI: 10.1097/01.ede.0000215160.88317.cb. Epidemiology, Volume 17, Number 3, May 2006.
A SIMPLE LINEAR INSTRUMENTAL VARIABLES MODEL
In an RCT, the main purpose is to estimate the effect of one explanatory factor (the treatment) on an outcome variable. Because treatments have been randomly assigned to individuals, the treatment variable is in general independent of other explanatory factors. In case of a continuous outcome and a linear model, this randomization procedure allows one to estimate the treatment effect by means of ordinary least squares with a well-known unbiased estimator (see, for instance, Pestman). In observational studies, on the other hand, one has no control over this explanatory factor (further denoted as exposure) so that ordinary least squares as an estimation method will generally be biased because of the existence of unmeasured confounders.
For example, one cannot directly estimate the effect of cigarette smoking on health without considering confounding factors such as age and socioeconomic position. One way to adjust for all possible confounding factors, whether observed or not, is to make use of an instrumental variable. The idea is that the causal effect of exposure on outcome can be captured by using the relationship between the exposure and another variable, the instrumental variable. How this variable can be selected and which conditions have to be fulfilled is discussed subsequently. First, we illustrate the model and its estimator.
The Instrumental Variables Model and Its Estimator
A simple linear model for IV estimation consists of 2 equations:
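The two equations themselves are not reproduced in this excerpt. As a minimal sketch, the simulation below assumes a common textbook form of the linear IV model: an outcome equation y = alpha + beta*x + error and an exposure equation x = gamma + delta*z + error, with an unmeasured confounder u entering both errors. The function names, the binary instrument, and all parameter values are illustrative assumptions, not taken from the article. The sketch compares the ordinary least squares slope with the simple IV ratio estimator cov(z, y) / cov(z, x), and checks that an explicit two-stage least squares fit gives the same slope when there is a single instrument.

```python
import numpy as np

BETA = 0.5  # assumed true causal effect of exposure x on outcome y


def simulate(n, delta, seed=0):
    """Draw (z, x, y) from an assumed linear IV model with a hidden confounder u."""
    rng = np.random.default_rng(seed)
    z = rng.binomial(1, 0.5, size=n).astype(float)  # instrument, independent of u
    u = rng.normal(size=n)                          # unmeasured confounder
    x = 1.0 + delta * z + u + rng.normal(size=n)    # exposure equation
    y = 2.0 + BETA * x + u + rng.normal(size=n)     # outcome equation
    return z, x, y


def ols_slope(x, y):
    """Ordinary least squares slope of y on x (ignores the confounder)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


def iv_slope(z, x, y):
    """IV ratio estimator: cov(z, y) / cov(z, x)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]


def two_stage_slope(z, x, y):
    """Two-stage least squares: regress x on z, then y on the fitted x."""
    d, g = np.polyfit(z, x, 1)          # stage 1: slope and intercept of x on z
    x_hat = g + d * z                   # exposure predicted from the instrument
    return np.polyfit(x_hat, y, 1)[0]   # stage 2: slope of y on predicted x


z, x, y = simulate(n=200_000, delta=1.0)
print("true effect:", BETA)
print("OLS slope  :", ols_slope(x, y))
print("IV slope   :", iv_slope(z, x, y))
print("2SLS slope :", two_stage_slope(z, x, y))
```

In this assumed setup the OLS slope is pulled away from the true effect by the confounder, whereas the IV slope should land close to it. Rerunning with a smaller `delta` (a weaker instrument) or a smaller `n` is one way to see the imprecision that the abstract warns about.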